For N=1:

rank | frequency | n-gram |
---|---|---|
1 | 345446 | -n |
2 | 320312 | -s |
3 | 307509 | -a |
4 | 198853 | -r |
5 | 172511 | -t |

For N=2:

rank | frequency | n-gram |
---|---|---|
1 | 207789 | -en |
2 | 106988 | -er |
3 | 88158 | -na |
4 | 84696 | -et |
5 | 60603 | -us |

For N=3:

rank | frequency | n-gram |
---|---|---|
1 | 59524 | -rna |
2 | 44230 | -ing |
3 | 35479 | -gen |
4 | 29297 | -ten |
5 | 28750 | -ens |

For N=4:

rank | frequency | n-gram |
---|---|---|
1 | 30930 | -erna |
2 | 25424 | -ning |
3 | 24248 | -ngen |
4 | 21327 | -arna |
5 | 15014 | -ngar |

For N=5:

rank | frequency | n-gram |
---|---|---|
1 | 20988 | -ingen |
2 | 14316 | -ensis |
3 | 13944 | -ingar |
4 | 8458 | -ionen |
5 | 7378 | -terna |

The tables show the most frequent letter N-grams at the end of words for N = 1…5. The procedure parallels 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos := @pos + 1 AS pos, xx.* FROM (SELECT @pos := 0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram FROM words WHERE w_id > 100 GROUP BY RIGHT(word, 3) ORDER BY cnt DESC) xx LIMIT 5;
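
The same ranking can be reproduced outside the database, which makes it easy to run for every N in one pass. The following is only a minimal Python sketch, not the code used to produce the tables. It assumes the word list is held in memory as plain strings. The function name `top_endings` and the sample words are hypothetical, and the `w_id>100` filter from the query is not reproduced. As with `right(word,3)` in the query, a word shorter than N contributes its whole form as its ending.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent n-letter endings."""
    # word[-n:] yields the whole word when it is shorter than n letters,
    # which matches the behaviour of right(word, n) in the SQL query.
    counts = Counter(word[-n:] for word in words)
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(counts.most_common(limit), start=1)
    ]


# Hypothetical sample data; a real run would read the corpus word list instead.
sample = [
    "bilarna", "husen", "flickorna", "pojkarna",
    "boken", "regeringen", "tidningen", "böckerna",
]

for n in range(1, 6):
    print(n, top_endings(sample, n, limit=2))
```

Each list entry counts once here, just as each row of the `words` table counts once in the query. Weighting endings by corpus frequency would require summing a per-word frequency instead.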
See also: 2.2.5 Most frequent word beginnings